Approximate P-Values for Local Sequence Alignments: Numerical Studies

نویسندگان

  • John D. Storey
  • David Siegmund
چکیده

Siegmund and Yakir (2000) have given an approximate p-value when two independent, identically distributed sequences from a finite alphabet are optimally aligned based on a scoring system that rewards similarities according to a general scoring matrix and penalizes gaps (insertions and deletions). The approximation involves an infinite sequence of difficult-to-compute parameters. In this paper, it is shown by numerical studies that these reduce to essentially two numerically distinct parameters, which can be computed as one-dimensional numerical integrals. For an arbitrary scoring matrix and affine gap penalty, this modified approximation is easily evaluated. Comparison with published numerical results show that it is reasonably accurate.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance.

We present a novel method for the comparison of multiple protein alignments with assessment of statistical significance (COMPASS). The method derives numerical profiles from alignments, constructs optimal local profile-profile alignments and analytically estimates E-values for the detected similarities. The scoring system and E-value calculation are based on a generalization of the PSI-BLAST ap...

متن کامل

COMPASS server for remote homology inference

COMPASS is a method for homology detection and local alignment construction based on the comparison of multiple sequence alignments (MSAs). The method derives numerical profiles from given MSAs, constructs local profile-profile alignments and analytically estimates E-values for the detected similarities. Until now, COMPASS was only available for download and local installation. Here, we present...

متن کامل

H-tuple approach to evaluate statistical significance of biological sequence comparison with gaps.

We propose an approximate distribution for the gapped local score of a two sequence comparison. Our method stands on combining an adapted scoring scheme that includes the gaps and an approximate distribution of the ungapped local score of two independent sequences of i.i.d. random variables. The new scoring scheme is defined on h-tuples of the sequences, using the gapped global score. The influ...

متن کامل

Robust E-Values for Gapped Local Alignments

We examine a Poisson heuristic for judging the significance of local sequence alignments with gaps. Model parameters are estimated directly from the sequences to be aligned, so that laborious prior simulation studies or database comparisons for the estimation of parameters describing the connection between score and E-value are unnecessary. Simulation studies give evidence that this method give...

متن کامل

Nonresonant Excitation of the Forced Duffing Equation

We investigate the hard nonresonant excitation of the forced Duffing equation with a positive damping parameter E. Using the symbolic manipulation system MACSYMA, a computer algebra system. we derive the two term perturbation expansion by the method of multiple time scales. The resulting approximate solution is valid for small values of the coefficient e As the damping parameter e increases, th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Journal of computational biology : a journal of computational molecular cell biology

دوره 8 5  شماره 

صفحات  -

تاریخ انتشار 2001